Text File | 1992-02-09 | 62KB | 1,824 lines
Stealth Bomber version 2.2 (formerly CRCSET)
Copyright (c) 1991, 1992 by Kevin Dean
Kevin Dean
Fairview Mall P.O. Box 55074
1800 Sheppard Avenue East
Willowdale, Ontario
CANADA M2J 5B9
CompuServe ID: 76336,3114
Contents
--------
Warranty .................................... 1
License ..................................... 2
Call For Programmers ........................ 3
Introduction ................................ 4
What is a CRC? .............................. 5
Other Types of Self-Checking ................ 9
Limitations of Self-Checking ................ 10
DOS Memory Control Blocks ................... 11
Stealth Viruses ............................. 13
How CRCSET.EXE Works ........................ 15
How to Use Stealth Bomber - C ............... 21
How to Use Stealth Bomber - Turbo Pascal .... 23
CRCSET.EXE Syntax and Messages .............. 25
Vulnerability ............................... 28
Warranty
The author of Stealth Bomber (hereafter referred to as "the author")
makes no warranty of any kind, expressed or implied, including without
limitation any warranties of merchantability and/or fitness for a particular
purpose. The author shall not be liable for any damages, whether direct,
indirect, special, or consequential arising from a failure of this program to
operate in the manner desired by the user. The author shall not be liable for
any damage to data or property which may be caused directly or indirectly by
use of the program.
In no event will the author be liable to the user for any damages,
including any lost profits, lost savings, or other incidental or consequential
damages arising out of the use or inability to use the program, or for any
claim by any other party.
- Page 1 -
License
This program is public domain. As such, it may be freely distributed
by anyone by any means. No person or organization may charge for this program
except for a minimal charge to cover handling and distribution.
A quiet option ('-q') is available principally for developers; see
the section "CRCSET.EXE Syntax and Messages" for details. If you wish to
distribute Stealth Bomber as part of the installation procedure for your
package, you are free to do so without obligation under the condition that the
executable CRCSET.EXE itself is the only part of this package that you
distribute. You may not distribute CRCSET.EXE as part of a generic INSTALL
package; if you wish to do so, contact me at the address on the first page of
this document.
Having said that, I would like to add that this algorithm has taken a
lot of time and work to develop. If you like this program, send me a postcard
and let me know. I would also be interested in copies of your own programs if
you feel that that is appropriate.
Also, if you have any questions or would like to see some more
features in the program, drop me a note by surface or electronic mail (my
address is on the first page of this file). I will answer all mail regarding
this program.
- Page 2 -
Call For Programmers
I have had requests for support for other languages, most notably
QBASIC. If you are willing to port this code to another language (other than
C, C++, or Turbo Pascal), I will gladly add your code and your name to the
package. Contact me at the address on the first page for details.
- Page 3 -
Introduction
Stealth Bomber is an anti-virus utility. It performs simple checks by
looking for suspicious behaviour in the operating system and checks files with
one of the most effective weapons against computer viruses: the Cyclic
Redundancy Check, or CRC. A full understanding of the DOS internals and the
CRC is not required to use this utility; if you like, you can skip over the
discussions of the CRC and DOS memory control blocks to the sections entitled
"How to Use Stealth Bomber - <language>".
There are many utilities that perform CRC checks on other programs but
most of these are external programs that are usually run only once, if at all.
The CRC generated by these utilities must be compared to a value in an
external file; if the values match, the program is presumed not to be infected.
This approach has two problems. The first is that the CRC check is
run only once, when the user first gets the program; most people would
never run the check a second time. The second problem is that the CRC is
stored in an external file (e.g. the documentation). If someone wants to tack
a virus onto the program, it becomes a simple matter to run the validation
program on the infected copy, copy the new CRC values to the documentation,
and distribute the infected program. Anyone running the validation program
would find the same CRC in the program as in the documentation, and in comes
the virus.
Another (increasingly popular) approach is for the CRC to be stored in
the program itself (the .EXE file) and for the program to do its own check
every time it is loaded. This method is much more effective than the previous
one because it means that the moment the program is infected and the CRC
changes, the infection will be detected the next time the program is run.
There is a potential problem with this method, but before I get into
that, we need some background.
- Page 4 -
What is a CRC?
The CRC, or Cyclic Redundancy Check, is an error-checking algorithm
used in many types of computer operations, especially in data transfer. For
example, whenever your computer writes to disk, the disk controller calculates
the CRC of the data being written and writes it with the data. If your disk
should somehow become corrupted, the CRC check will fail when you next try to
read the data and the disk controller will return with an error, forcing DOS
to display the critical error "Data error reading drive C:". Most file
transfer protocols (like Xmodem, Zmodem, and some derivatives of Kermit) also
use a CRC to validate the data being transmitted: if the CRC transmitted with
the data does not match the CRC calculated by the receiving program, then the
transmission has failed and the sending program is asked to retry the
transmission.
The actual calculation of the CRC is very simple. The algorithm
consists of two parts, a CRC polynomial and a CRC register, and is really an
exercise in modulo-2 arithmetic. The rules for modulo-2 arithmetic are shown
in the following table:
0 + 0 = 0
0 + 1 = 1
1 + 0 = 1
1 + 1 = 0
For those of you familiar with binary logic, these rules are equivalent to
the exclusive-or operation. Note: under modulo-2 arithmetic, subtraction is
equivalent to addition.
There is nothing magical about modulo-2 arithmetic and it has no
special properties that make it better suited to CRC calculations than
standard arithmetic; rather, since modulo-2 arithmetic doesn't carry from one
column to the next (i.e. 1 + 1 = 0 with no carry), it's just easier to
implement in hardware and software than any other method. Consider the
following:
1. Binary addition, normal rules of carry
       101101001
     + 110110111
     -----------
      1100100000
2. Binary addition, modulo-2 arithmetic (no carry)
       101101001
     + 110110111
     -----------
       011011110
The first addition requires us to carry any overflow from right to left. The
second addition requires no carry operations and can be performed much faster
both by humans and by computers.
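In C, modulo-2 addition of whole words is simply the exclusive-or operator. The sketch below is not part of the Stealth Bomber sources; the function names are made up for illustration. It reproduces the two sums above.

```c
#include <stdint.h>

/* Ordinary binary addition: carries ripple from right to left. */
uint16_t add_with_carry(uint16_t a, uint16_t b)
{
    return (uint16_t)(a + b);
}

/* Modulo-2 addition: every column is added independently of its
   neighbours, so the whole word is handled by one exclusive-or. */
uint16_t add_mod2(uint16_t a, uint16_t b)
{
    return (uint16_t)(a ^ b);
}
```

With a = 0x169 (101101001) and b = 0x1B7 (110110111), add_with_carry() gives 0x320 (1100100000) and add_mod2() gives 0x0DE (011011110).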
The CRC algorithm can best be illustrated by the following diagram of
a 4-bit CRC generator:
- Page 5 -
                 CRC polynomial
------------1-----------0-----------1-----------1
|     3     |     2           1     |     0     |
|   -----   v   -----       -----   v   -----   v
+ <-| x |<- + <-| x |<------| x |<- + <-| x |<- +
^   -----       -----       -----       -----
|                 CRC register
---- binary data stream
Each '+' symbol represents modulo-2 addition. The numbers above the CRC
register are the bit numbers of the register.
The CRC is calculated as follows:
1. Initialize the CRC register to 0.
2. Add the incoming bit of the data stream to the outgoing bit (bit 3) of the
CRC register.
3. Send the result of step 2 into the polynomial feedback loop.
4. Add the value in the feedback loop to the bits in the CRC register as it is
shifted left. The bits affected are determined by the CRC polynomial (i.e.
there is an addition for every bit in the polynomial that is equal to 1; if
the bit is 0, it is not fed back into the register). In this case, the
polynomial represented is 1011.
5. Repeat steps 2-4 for every bit in the data stream.
6. The CRC is the final value in the register.
Let's try this with the data stream 11010111 and the polynomial 1011. The
result will be a 4-bit CRC.
The output stream to the left is the result of each addition operation
at the left-most gate. This is the value that is fed into the polynomial
feedback loop during the left shift.
            ------------1-----------0-----------1-----------1
            |     3     |     2           1     |     0     |
            |   -----   v   -----       -----   v   -----   v
            + <-| 0 |<- + <-| 0 |<------| 0 |<- + <-| 0 |<- +
            ^   -----       -----       -----       -----
            |
            ---- 11010111
            ------------1-----------0-----------1-----------1
            |     3     |     2           1     |     0     |
            |   -----   v   -----       -----   v   -----   v
       1 <- + <-| 1 |<- + <-| 0 |<------| 1 |<- + <-| 1 |<- +
            ^   -----       -----       -----       -----
            |
            ---- 1010111
- Page 6 -
            ------------1-----------0-----------1-----------1
            |     3     |     2           1     |     0     |
            |   -----   v   -----       -----   v   -----   v
      10 <- + <-| 0 |<- + <-| 1 |<------| 1 |<- + <-| 0 |<- +
            ^   -----       -----       -----       -----
            |
            ---- 010111
            ------------1-----------0-----------1-----------1
            |     3     |     2           1     |     0     |
            |   -----   v   -----       -----   v   -----   v
     100 <- + <-| 1 |<- + <-| 1 |<------| 0 |<- + <-| 0 |<- +
            ^   -----       -----       -----       -----
            |
            ---- 10111
            ------------1-----------0-----------1-----------1
            |     3     |     2           1     |     0     |
            |   -----   v   -----       -----   v   -----   v
    1000 <- + <-| 1 |<- + <-| 0 |<------| 0 |<- + <-| 0 |<- +
            ^   -----       -----       -----       -----
            |
            ---- 0111
            ------------1-----------0-----------1-----------1
            |     3     |     2           1     |     0     |
            |   -----   v   -----       -----   v   -----   v
   10001 <- + <-| 1 |<- + <-| 0 |<------| 1 |<- + <-| 1 |<- +
            ^   -----       -----       -----       -----
            |
            ---- 111
            ------------1-----------0-----------1-----------1
            |     3     |     2           1     |     0     |
            |   -----   v   -----       -----   v   -----   v
  100010 <- + <-| 0 |<- + <-| 1 |<------| 1 |<- + <-| 0 |<- +
            ^   -----       -----       -----       -----
            |
            ---- 11
            ------------1-----------0-----------1-----------1
            |     3     |     2           1     |     0     |
            |   -----   v   -----       -----   v   -----   v
 1000101 <- + <-| 0 |<- + <-| 1 |<------| 1 |<- + <-| 1 |<- +
            ^   -----       -----       -----       -----
            |
            ---- 1
            ------------1-----------0-----------1-----------1
            |     3     |     2           1     |     0     |
            |   -----   v   -----       -----   v   -----   v
10001011 <- + <-| 0 |<- + <-| 1 |<------| 0 |<- + <-| 1 |<- +
            ^   -----       -----       -----       -----
            |
            ----
The CRC is 0101.
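The six steps above can be written as a short C routine. This is only a sketch for following the diagrams, not the code shipped in the package: the function name and the use of a '0'/'1' character string for the data stream are assumptions made for readability.

```c
#include <stdint.h>
#include <stddef.h>

/* Bit-at-a-time CRC register of `width` bits (1..32).
   `poly` holds the feedback taps, e.g. 0xB (binary 1011) for the
   4-bit generator in the diagrams. `bits` is a string of '0' and
   '1' characters, fed into the register from left to right. */
uint32_t crc_bitwise(uint32_t reg, uint32_t poly, unsigned width,
                     const char *bits)
{
    uint32_t top  = (uint32_t)1 << (width - 1);  /* outgoing bit   */
    uint32_t mask = (top - 1) | top;             /* low width bits */
    size_t i;

    for (i = 0; bits[i] != '\0'; i++) {
        /* Step 2: add the incoming bit to the outgoing bit. */
        uint32_t in = (uint32_t)(bits[i] == '1');
        uint32_t feedback = in ^ ((reg & top) ? 1u : 0u);

        /* Steps 3-4: shift left, then add the feedback wherever
           the polynomial has a 1. */
        reg = (reg << 1) & mask;
        if (feedback)
            reg ^= poly;
    }
    return reg;                                  /* step 6 */
}
```

Calling crc_bitwise(0, 0xB, 4, "11010111") returns 0x5, the 0101 worked out above.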
- Page 7 -
What should be obvious at this point is that if a single bit in the
data stream is changed, the value in the CRC register is corrupted completely.
The feedback loop ensures that the error is propagated throughout the entire
calculation.
Most CRC algorithms use either a 16- or 32-bit polynomial. The longer
the polynomial, the more effective it is at catching errors; a 16-bit CRC, for
example, catches more than 99.99% of all random errors in a data stream.
All other CRC algorithms are analogous to the 4-bit algorithm
presented here. There are some optimizations that can process several bits at
a time; the source code included with this program uses a lookup table that
can process 8 bits at once. For further discussion of the CRC algorithm and
its variations, I recommend "C Programmer's Guide to Serial Communications" by
Joe Campbell, published by Howard W. Sams & Company.
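To show what an 8-bits-at-a-time lookup table looks like, here is a generic sketch. It is not the table code shipped with this package: the function names are invented, and the conventions (32-bit register, most significant bit first, polynomial passed in by the caller) are assumptions chosen to match the 4-bit diagrams. A bit-at-a-time version is included so the two can be compared.

```c
#include <stdint.h>
#include <stddef.h>

static uint32_t crc_table[256];

/* Entry i is the register that results from feeding byte i, most
   significant bit first, into a register that starts at zero. */
void crc32_build_table(uint32_t poly)
{
    unsigned i, bit;

    for (i = 0; i < 256; i++) {
        uint32_t reg = (uint32_t)i << 24;
        for (bit = 0; bit < 8; bit++)
            reg = (reg & 0x80000000u) ? (reg << 1) ^ poly : (reg << 1);
        crc_table[i] = reg;
    }
}

/* Table-driven update: one table lookup per byte of data. */
uint32_t crc32_update(uint32_t reg, const unsigned char *data, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        reg = (reg << 8) ^ crc_table[((reg >> 24) ^ data[i]) & 0xFFu];
    return reg;
}

/* Bit-at-a-time reference using the same steps as the diagrams. */
uint32_t crc32_bitwise(uint32_t reg, uint32_t poly,
                       const unsigned char *data, size_t len)
{
    size_t i;
    int bit;

    for (i = 0; i < len; i++)
        for (bit = 7; bit >= 0; bit--) {
            uint32_t in = (data[i] >> bit) & 1u;
            uint32_t feedback = in ^ (reg >> 31);
            reg <<= 1;
            if (feedback)
                reg ^= poly;
        }
    return reg;
}
```

After crc32_build_table(poly) has been called once, crc32_update() and crc32_bitwise() return the same value for the same polynomial, starting register, and data; the table version simply does eight shifts' worth of work per lookup.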
- Page 8 -
Other Types of Self-Checking
Some viruses may not be detectable by checking the CRC of the file
(see "Limitations of Self-Checking" below for details), but there are some
obvious and simple tests we can perform to check for suspicious behaviour.
The first is to check the date and time of the file. DOS does
precious little checking when setting the date and time fields (unless it sets
them itself when creating the file) and so it is possible to fiddle around
with them. For example, there is a little-known seconds field in the time
stamp, and it is possible to set this as high as 62 seconds. The year can go
up to 2107. Some viruses mark their presence by changing these fields, and we
can check for such signs by doing a simple validation on the fields.
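A validation of this kind might look like the following sketch. The packed layout of the DOS date and time words is standard; the function name and the max_year parameter (the latest year the caller is willing to accept) are assumptions, and the checks actually made by this package may differ.

```c
#include <stdint.h>

/* DOS packs a file's stamp into two 16-bit words:
     date: bits 15-9 = year - 1980, bits 8-5 = month, bits 4-0 = day
     time: bits 15-11 = hour, bits 10-5 = minute, bits 4-0 = seconds / 2
   Returns 1 if every field is in range, 0 otherwise. */
int dos_stamp_is_valid(uint16_t date, uint16_t time, unsigned max_year)
{
    unsigned year   = 1980u + (unsigned)(date >> 9);
    unsigned month  = (date >> 5) & 0x0Fu;
    unsigned day    = date & 0x1Fu;
    unsigned hour   = (unsigned)(time >> 11);
    unsigned minute = (time >> 5) & 0x3Fu;
    unsigned second = (time & 0x1Fu) * 2u;

    if (year > max_year)          return 0;
    if (month < 1 || month > 12)  return 0;
    if (day < 1 || day > 31)      return 0;
    if (hour > 23 || minute > 59) return 0;
    if (second > 59)              return 0;  /* the field can reach 62 */
    return 1;
}
```

For example, 1992-02-09 12:30:10 packs to date 0x1849 and time 0x63C5 and passes; setting the seconds field to its maximum (time 0x63DF, i.e. 62 seconds) fails.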
Another thing to watch for is some kind of consistency in the file
size. If the size returned by the directory information doesn't match the
size of the file when it is opened, something is wrong and we can signal that
kind of error.
- Page 9 -
Limitations of Self-Checking
Doing a CRC check on a file works for most viruses, and in fact a
number of programs do perform some kind of self-check to make sure that they
have not been infected. It wasn't long, however, before a new kind of virus,
the stealth virus, hit the PC world.
Perhaps the best-known stealth virus is the "Hundred Years" virus.
The virus changes the date stamp of an infected file by adding one hundred to
the year portion. Since the DOS DIR command shows only the last two digits of
the year, this is not always obvious.
The stealth virus manages to hide itself from a CRC check by hooking
into the DOS interrupt and monitoring certain system calls. When the file
opens itself for a self-check, the virus first disinfects the file before
passing the file open call to the operating system. The CRC self-check is
then working with a clean file. The "Hundred Years" virus also monitors the
DOS DIR calls and, if it sees a file with the year greater than 2079,
subtracts one hundred from the year and subtracts 4096 from the size before
returning the directory information to the calling program. The net result is
that the calling program sees the directory information exactly as it would
appear if the virus had not in fact infected the file. Every test built into
the file checking algorithm in this package would pass with flying colours.
The "Hundred Years" virus is also particularly clever at hiding itself
from DOS mapping utilities. There is no indication, using regular DOS checks,
that the virus is in memory at all. All the interrupt pointers are pointing
where they should and no memory seems to be taken up by the virus at all. How
is this possible?
- Page 10 -
DOS Memory Control Blocks
DOS is a linear memory system. It was first designed, and still is
designed, with the idea that one program will sit on top of another in memory
and that the program at the top of memory will be the one that is currently
running. As a result, it has a very simple memory manager.
The DOS Memory Control Block, or MCB, contains everything there is to
know about a memory block: its owner and the number of paragraphs (16-byte
chunks) allocated to that block. The MCB takes up one paragraph and has the
following format:
BYTE ID          'M' for every block except the last, which is 'Z'.
WORD PSP         The Program Segment Prefix of the program that owns the
                 block; 0 if the block is free.
WORD SIZE        The size of the block in paragraphs.
BYTE UNUSED[3]   Unused.
BYTE NAME[8]     The file name of the owning program; unused below DOS 4.x.
The only fields we are really interested in are the ID (so we know when we're
at the end of the chain) and the size (so we know how big it is and where the
next MCB is). To find the next MCB, add the size of the allocated block
plus one (for the MCB itself) to the current MCB segment address, i.e.
next MCB segment = current MCB segment + block size + 1
The MCB list covers all of memory, which means that the address
of the last block plus its size points to the end of usable memory, i.e.
last MCB address + its size + 1 = A000h = first paragraph beyond 640k.
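The chain walk can be sketched in portable C by treating memory as a flat byte array indexed by paragraph. This is an illustration only: the real check has to read real-mode memory through far pointers, and the function names and the "image" parameters here are assumptions.

```c
#include <stdint.h>
#include <stddef.h>

#define PARA 16u   /* bytes per paragraph */

/* ID byte of the MCB stored at segment `seg` of the image. */
static uint8_t mcb_id(const uint8_t *mem, uint16_t seg)
{
    return mem[(size_t)seg * PARA];
}

/* SIZE field (offset 3, little-endian word) of the MCB at `seg`. */
static uint16_t mcb_size(const uint8_t *mem, uint16_t seg)
{
    const uint8_t *p = mem + (size_t)seg * PARA;
    return (uint16_t)(p[3] | (p[4] << 8));
}

/* Walk the chain starting at segment `first`. `limit` is the number
   of paragraphs in the image. Returns the first paragraph beyond the
   last ('Z') block, or 0 if the chain is broken. */
uint16_t mcb_end_of_memory(const uint8_t *mem, uint16_t limit,
                           uint16_t first)
{
    uint16_t seg = first;

    for (;;) {
        uint8_t id;
        uint32_t next;

        if (seg >= limit)
            return 0;                       /* ran off the image */
        id = mcb_id(mem, seg);
        if (id != 'M' && id != 'Z')
            return 0;                       /* not an MCB */
        /* next MCB segment = current MCB segment + block size + 1 */
        next = (uint32_t)seg + mcb_size(mem, seg) + 1u;
        if (next > limit)
            return 0;
        if (id == 'Z')
            return (uint16_t)next;          /* end of usable memory */
        seg = (uint16_t)next;
    }
}
```

On an unmodified 640k machine the value returned for the real chain would be A000h.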
Unfortunately, there are some exceptions to this nice and simple
memory layout. Some utilities extend DOS memory by remapping upper memory
(especially in 386 machines) to give DOS as much as 920k and may not always
update the BIOS memory to reflect the change. Some clones of MS-DOS may
differ by as much as 1k in the amount of memory they report versus the amount
of BIOS memory available, and DOS 5.0 memory differs from BIOS memory by only
one paragraph when it is loaded in the upper memory block (UMB). The long and
the short of this is that DOS memory may not always equal BIOS memory, but we
can adjust for this by saying that it may be no more than 1k short of the
memory reported by the BIOS.
Every program has at least two blocks allocated to it. The first, the
environment block, contains the environment (such as the PATH, COMSPEC, etc.)
of the program. The second block is allocated to the code and data of the
program itself. Any other memory management that the program requires must be
done through DOS functions 48h (allocate memory), 49h (free memory), and 4Ah
(resize memory). When the program terminates, DOS walks through the list of
MCB's and frees all the memory associated with that program by releasing all
blocks with the PSP equal to that of the program.
The following diagram shows a simplified DOS MCB layout; a typical
system could have well over two dozen MCB's.
- Page 11 -
            +===========+
    MCB --> | ID | Size |  DOS config
            +-----------+
            |   Block   |  <-- `Size' paragraphs long.
            +===========+
                  .
                  .        More DOS config, COMMAND.COM, TSR's, etc.
                  .
            +===========+
    MCB --> | ID | Size |  Program's environment block
            +-----------+
            |   Block   |
            +===========+
    MCB --> | ID | Size |  Program
            +-----------+
            |   Block   |
            +===========+
    MCB --> | ID | Size |  Free
            +-----------+
            |   Block   |
            +===========+  <-- End of usable memory (typically 640k).
The MCB is one of many things that Microsoft has never documented but
which, like every other undocumented feature, has been ripped apart and
analyzed by countless hackers over the years. The best reference for the MCB
(and countless other undocumented but useful DOS features) is "Undocumented
DOS", by Andrew Schulman et al., published by Addison-Wesley Publishing
Company.
In brief, loading the AH register with 52h and calling the DOS
function interrupt returns a pointer to the DOS "list of lists" in ES:BX (see
"Undocumented DOS" for more information about this list). The segment address
of the first MCB is at ES:[BX-2].
- Page 12 -
Stealth Viruses
Many stealth viruses take advantage of the simplicity of the MCB to
hide themselves in upper memory. The virus gets control the moment the
program is loaded and the first thing it does is copy itself to the end of
memory. It then walks through the MCB list until it gets to the last MCB, and
shrinks the size field so that the end of memory, at least as far as DOS is
concerned, is 640k minus the size of the virus! DOS will never allocate
memory beyond what it knows to exist in the MCB list and so the virus is safe
from being overwritten.
The only way to tell that a virus has done this is to calculate the
DOS memory by walking the MCB chain and comparing it to the memory reported by
the BIOS, and this is exactly what the system check in this package does.
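The comparison itself is small. The sketch below applies the 1k tolerance described in the section on memory control blocks; the function name is invented, and it assumes the BIOS figure is expressed in kilobytes (as BIOS interrupt 12h reports it) while the DOS figure is the paragraph address at which the MCB chain ends.

```c
#include <stdint.h>

#define PARAS_PER_KB 64u   /* 1024 bytes / 16 bytes per paragraph */

/* dos_end_seg: first paragraph beyond the last block in the MCB chain.
   bios_kb:     memory size reported by the BIOS, in kilobytes.
   Returns 1 if DOS memory is no more than 1k short of BIOS memory
   (DOS may legitimately report more than the BIOS), 0 otherwise. */
int dos_memory_consistent(uint16_t dos_end_seg, uint16_t bios_kb)
{
    uint32_t bios_paras = (uint32_t)bios_kb * PARAS_PER_KB;

    return (uint32_t)dos_end_seg + PARAS_PER_KB >= bios_paras;
}
```

With a BIOS figure of 640k, a chain ending at A000h passes, while a chain ending 4k lower (9F00h), as a 4k virus shrinking the last block would leave it, fails.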
Once a virus has loaded itself into upper memory, it must then take
over the DOS interrupt 21h, and possibly some others, in order to monitor DOS
calls. If it simply redirects the interrupt vector, it would be easily found
by checking the vector and making sure that it is not pointing to something
beyond the program. Six critical interrupts, 21h (DOS function), 24h
(critical error), 25h (absolute disk read), 26h (absolute disk write), 1Ch
(user timer), and 28h (DOS OK) are checked in this way.
A better way to hijack an interrupt is to patch the interrupt code
itself with a far jump to the virus. This means simply replacing the first
five bytes of the DOS interrupt with "JMP FAR <addr>" where <addr> is the
address of the virus' interrupt handler. This is what happens when a DOS
function interrupt occurs:
1) control is transferred to the actual address of the DOS interrupt
handler;
2) a far jump is made to the virus;
3) the virus patches the DOS code with the five bytes that were
replaced by the far jump instruction;
4) the virus does any pre-DOS processing necessary based on the
function, such as disinfecting a file;
5) the virus passes the original request back to the DOS handler;
6) DOS processes the request and returns to the virus;
7) the virus does any post-DOS processing necessary based on the
function, such as changing the directory information returned by
DOS;
8) the virus replaces the first five bytes of the DOS code with the
far jump instruction; and
9) the virus returns control to the calling program.
The trick then is to look for the far jump instruction. Three other
instructions, a far call, an indirect far jump through a value stored in the
code segment, and an indirect far call through a value stored in the code
segment are also tested for.
It isn't enough to look for the far jump instruction: we also have to
figure out the destination of the jump. Some DOS clones move interrupt code
into the 64k block just above the 1 Megabyte limit on 286 and higher machines
and transfer control with a far jump instruction. Resident anti-virus
utilities redirect the DOS function interrupt and at least one of them has a
far jump as the first instruction in its interrupt handler. I can only assume
that this is to fool a stealth virus into thinking that it is already
installed. However, the destination of the far jump falls within the same MCB
as the interrupt vector. We can modify the test as follows:
1) check for a suspicious instruction, such as a far jump, at the
beginning of the interrupt handler;
2) get the destination address of the jump or call;
3) if the destination of the jump is above the memory limit (defined
as the maximum of DOS and BIOS memory), the redirection is valid;
4) else if the interrupt handler and the destination of the jump fall
within the same memory block, the redirection is valid, otherwise
it is invalid.
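The test can be sketched as follows. The opcode bytes are the standard 8086 encodings of the four instructions; everything else (the names, comparing segments only and ignoring offsets, and passing the bounds of the handler's memory block as parameters) is a simplifying assumption, not the package's actual code. For the two indirect forms the destination has to be read from the far pointer stored at the given offset in the handler's code segment, which is left out here.

```c
#include <stdint.h>

enum redirect_kind {
    REDIRECT_NONE = 0,
    REDIRECT_JMP_FAR,       /* EA oo oo ss ss   JMP  FAR ssss:oooo    */
    REDIRECT_CALL_FAR,      /* 9A oo oo ss ss   CALL FAR ssss:oooo    */
    REDIRECT_JMP_FAR_CS,    /* 2E FF 2E dd dd   JMP  FAR CS:[dddd]    */
    REDIRECT_CALL_FAR_CS    /* 2E FF 1E dd dd   CALL FAR CS:[dddd]    */
};

/* Step 1: look for a suspicious instruction in the first five bytes
   of the interrupt handler. */
enum redirect_kind classify_handler(const uint8_t code[5])
{
    if (code[0] == 0xEA) return REDIRECT_JMP_FAR;
    if (code[0] == 0x9A) return REDIRECT_CALL_FAR;
    if (code[0] == 0x2E && code[1] == 0xFF) {
        if (code[2] == 0x2E) return REDIRECT_JMP_FAR_CS;
        if (code[2] == 0x1E) return REDIRECT_CALL_FAR_CS;
    }
    return REDIRECT_NONE;
}

/* Step 2, direct forms only: the destination segment is the
   little-endian word in bytes 3 and 4. */
uint16_t direct_target_segment(const uint8_t code[5])
{
    return (uint16_t)(code[3] | (code[4] << 8));
}

/* Steps 3 and 4. mem_limit_seg is the larger of DOS and BIOS memory
   as a paragraph address; block_start and block_end bound the memory
   block that holds the interrupt handler (block_end exclusive). */
int redirect_is_valid(uint16_t dest_seg, uint16_t mem_limit_seg,
                      uint16_t block_start, uint16_t block_end)
{
    if (dest_seg >= mem_limit_seg)
        return 1;                   /* above the memory limit */
    return dest_seg >= block_start && dest_seg < block_end;
}
```

A handler beginning EA 34 12 00 9F, for instance, is a far jump to 9F00:1234. With a memory limit of A000h and a handler block spanning 0100h to 0200h, that destination is below the limit and outside the handler's block, so it is reported as invalid.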
- Page 14 -
How CRCSET.EXE Works
CRCSET.EXE, provided with this package, calculates the CRC of a file
and stores it either in the file itself or in a separate external file.
The idea of storing a program's CRC in the executable file itself has
one drawback: since the CRC is part of the program, it becomes part of the
data stream that is used to calculate the CRC. In other words, you have to
know what the CRC of the program is in order to calculate it! At compile and
link time, this is downright impossible; changing the slightest thing in your
code will change the CRC the next time you recompile.
Most algorithms that store the CRC in the program itself get around
this drawback by breaking up the program into three chunks:
+------------------------+-----+------------------------+
| <-- Program part 1 --> | CRC | <-- Program part 2 --> |
+------------------------+-----+------------------------+
The CRC is then calculated as the concatenation of parts 1 and 2, i.e. when
the CRC is calculated, it skips over itself completely in the calculation.
What it sees is this:
+------------------------+------------------------+
| <-- Program part 1 --> | <-- Program part 2 --> |
+------------------------+------------------------+
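A sketch of this conventional skip-over method (the one CRCSET improves upon) is shown below. The names, the zero starting register, and the most-significant-bit-first convention are assumptions for illustration.

```c
#include <stdint.h>
#include <stddef.h>

/* Feed one byte, most significant bit first, into a 32-bit register. */
static uint32_t crc32_feed(uint32_t reg, uint32_t poly, uint8_t byte)
{
    int bit;

    reg ^= (uint32_t)byte << 24;
    for (bit = 0; bit < 8; bit++)
        reg = (reg & 0x80000000u) ? (reg << 1) ^ poly : (reg << 1);
    return reg;
}

/* CRC of `data` with the `crc_len` bytes at offset `crc_pos` left
   out, i.e. the CRC of program part 1 followed by program part 2. */
uint32_t crc32_skipping(const uint8_t *data, size_t len, uint32_t poly,
                        size_t crc_pos, size_t crc_len)
{
    uint32_t reg = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        if (i >= crc_pos && i < crc_pos + crc_len)
            continue;               /* skip over the stored CRC */
        reg = crc32_feed(reg, poly, data[i]);
    }
    return reg;
}
```

Because the stored CRC never enters the calculation, any value can be written into that slot afterwards without changing the result.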
In order for a virus to infect any program that uses this method, it
must somehow find the location of the CRC within the file and recalculate the
CRC using the following data stream:
+------------------------+------------------------+---------------+
| <-- Program part 1 --> | <-- Program part 2 --> | <-- Virus --> |
+------------------------+------------------------+---------------+
It must then overwrite the old CRC with the new one.
I won't explain how (I don't want to give any virus-writers any
ideas), but with the right technique the CRC can be found, recalculated, and
rewritten in under 30 seconds.
CRCSET overcomes this limitation by making both the polynomial and the
CRC part of the data stream. In order to calculate the CRC, CRCSET looks for
a predefined string in the program (the default is _STEALTH), replaces the
first four bytes with a 32-bit polynomial, sets the next four bytes (the true
CRC) to 0, and calculates an intermediate CRC assuming that the true CRC is 0.
Then, through the magic of matrix algebra, CRCSET calculates what the true CRC
should have been in order to yield itself instead of the intermediate CRC at
the end. Let's take a look at a 4-bit CRC calculation as an example.
Let's assume that the polynomial in use is 1011, that the CRC
calculated up to the point where we reach the search string (represented by
the bit pattern STUVWXYZ) is 0010, and that the bit pattern 1100 follows the
search string:
- Page 15 -
------------1-----------0-----------1-----------1
|     3     |     2           1     |     0     |
|   -----   v   -----       -----   v   -----   v
+ <-| 0 |<- + <-| 0 |<------| 1 |<- + <-| 0 |<- +
^   -----       -----       -----       -----
|
---- STUVWXYZ1100
1. Replace the first four bits (STUV) with the CRC polynomial (1011):
------------1-----------0-----------1-----------1
|     3     |     2           1     |     0     |
|   -----   v   -----       -----   v   -----   v
+ <-| 0 |<- + <-| 0 |<------| 1 |<- + <-| 0 |<- +
^   -----       -----       -----       -----
|
---- 1011WXYZ1100
2. Calculate the value of the CRC register with the polynomial in the data
stream:
------------1-----------0-----------1-----------1
|     3     |     2           1     |     0     |
|   -----   v   -----       -----   v   -----   v
+ <-| 1 |<- + <-| 0 |<------| 0 |<- + <-| 1 |<- +
^   -----       -----       -----       -----
|
---- WXYZ1100
3. Replace the next four bits (WXYZ) with simple variables (X3, X2, X1, X0):
------------1-----------0-----------1-----------1
|     3     |     2           1     |     0     |
|   -----   v   -----       -----   v   -----   v
+ <-| 1 |<- + <-| 0 |<------| 0 |<- + <-| 1 |<- +
^   -----       -----       -----       -----
|
---- (X3)(X2)(X1)(X0)1100
4. Propagate X3+(bit 3) through the feedback loop:
---------------1-----------0------------1--------------1
|       3      |     2            1     |       0      |
|   --------   v   -----       ------   v   --------   v
+ <-| X3+1 |<- + <-| 0 |<------| X3 |<- + <-| X3+1 |<- +
^   --------       -----       ------       --------
|
---- (X2)(X1)(X0)1100
- Page 16 -
5. Propagate X2+(bit 3) through the feedback loop:
------------------1------------0------------1-----------------1
|        3        |      2            1     |        0        |
|   -----------   v   ------       ------   v   -----------   v
+ <-| X3+X2+1 |<- + <-| X3 |<------| X2 |<- + <-| X3+X2+1 |<- +
^   -----------       ------       ------       -----------
|
---- (X1)(X0)1100
In bit 1, for example, we have (X2+(bit 3))+(bit 0) = (X2+X3+1)+(X3+1) = X2
since the X3 terms cancel, no matter what the value of X3 is.
6. Propagate X1+(bit 3) through the feedback loop:
------------------1------------0------------1--------------------1
|        3        |      2            1     |          0         |
|   -----------   v   ------       ------   v   --------------   v
+ <-| X2+X1+1 |<- + <-| X2 |<------| X1 |<- + <-| X3+X2+X1+1 |<- +
^   -----------       ------       ------       --------------
|
---- (X0)1100
7. Propagate X0+(bit 3) through the feedback loop:
------------------1------------0---------------1--------------------1
|        3        |      2             1       |          0         |
|   -----------   v   ------       ---------   v   --------------   v
+ <-| X1+X0+1 |<- + <-| X1 |<------| X3+X0 |<- + <-| X2+X1+X0+1 |<- +
^   -----------       ------       ---------       --------------
|
---- 1100
8. Propagate the next bit through the feedback loop:
-------------1---------------0--------------1---------------1
|      3     |       2               1      |       0       |
|   ------   v   ---------       --------   v   ---------   v
+ <-| X0 |<- + <-| X3+X0 |<------| X2+1 |<- + <-| X1+X0 |<- +
^   ------       ---------       --------       ---------
|
---- 100
9. Repeat step 8 for all remaining bits:
---------------------1---------------0--------------1---------------1
|          3         |       2               1      |       0       |
|   --------------   v   ---------       --------   v   ---------   v
+ <-| X3+X2+X1+1 |<- + <-| X3+X0 |<------| X2+1 |<- + <-| X3+X2 |<- +
^   --------------       ---------       --------       ---------
|
----
- Page 17 -
We want the CRC in the register to be equal to the unknown CRC we
started inserting at step 4, i.e. we need:
     N    Value calculated for bit N     Bit N
    ---   --------------------------     -----
     3        X3 + X2 + X1 + 1        =   X3
     2        X3           + X0       =   X2
     1             X2      + 1        =   X1
     0        X3 + X2                 =   X0
If we collect all the variables on the left and all the constants on the
right (keeping in mind that we are dealing with modulo-2 arithmetic):
          X2 + X1      = 1
     X3 + X2      + X0 = 0
          X2 + X1      = 1
     X3 + X2      + X0 = 0
The constants on the right, read from top to bottom (1010), are the
intermediate CRC mentioned earlier.
Here we have an interesting situation. The first and third equations
are the same and so are the second and fourth. What we come down to is this:
          X2 + X1      = 1
     X3 + X2      + X0 = 0
We have four variables and only two equations. There is no unique solution;
in fact, there are four (2 to the power of (4 - number of independent
equations)) separate and distinct sets of values that will satisfy these
equations.
Since CRCSET needs a numeric solution, we have to arbitrarily set bits
to get one. For argument's sake, let's set X2 to 1.
          1  + X1      = 1
     X3 + 1       + X0 = 0
In other words:
               X1      = 0
     X3           + X0 = 1
By setting X2 to 1, we have also fixed X1. Now let's set X0 to 0.
     X3           + 0  = 1
In other words:
     X3                = 1
We now have a solution for the CRC of the program: 1100. There are three
others: 0101, 0010, and 1011. If we replace the string WXYZ with any of these
values, the CRC calculation process will yield that value at the end, e.g.:
- Page 18 -
------------1-----------0-----------1-----------1
|     3     |     2           1     |     0     |
|   -----   v   -----       -----   v   -----   v
+ <-| 0 |<- + <-| 0 |<------| 1 |<- + <-| 0 |<- +
^   -----       -----       -----       -----
|
---- 101111001100
         ----
yields
------------1-----------0-----------1-----------1
|     3     |     2           1     |     0     |
|   -----   v   -----       -----   v   -----   v
+ <-| 1 |<- + <-| 1 |<------| 0 |<- + <-| 0 |<- +
^   -----       -----       -----       -----
|
----
If you're not sure about this, try it with pen and paper. Plug in each of the
four values and you should get that same value at the end of the CRC
calculation process. To help you out, here are the values of the CRC register
for each step of the solution (the first value is the value after step 2 of
the calculation):
CRC
-----
1100: 1001, 0010, 1111, 0101, 1010, 0100, 0011, 0110, 1100
0101: 1001, 1001, 0010, 0100, 0011, 1101, 1010, 1111, 0101
0010: 1001, 1001, 1001, 0010, 0100, 0011, 1101, 0001, 0010
1011: 1001, 0010, 0100, 0011, 1101, 1010, 0100, 1000, 1011
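If pen and paper are not to hand, the four values can also be confirmed by brute force. The sketch below simply tries all 16 candidates for the 4-bit example; it is a checking aid with invented names, not the method CRCSET uses (trying every candidate is out of the question for a 32-bit CRC, which is why CRCSET solves the equations instead).

```c
#include <stdint.h>

/* Advance the 4-bit register (polynomial 1011) by the low `nbits`
   bits of `value`, most significant bit first. */
static unsigned crc4_feed(unsigned reg, unsigned value, unsigned nbits)
{
    while (nbits-- > 0) {
        unsigned in = (value >> nbits) & 1u;
        unsigned feedback = in ^ ((reg >> 3) & 1u);

        reg = (reg << 1) & 0xFu;
        if (feedback)
            reg ^= 0xBu;
    }
    return reg;
}

/* Try every 4-bit value v in place of WXYZ. `state` is the register
   after the polynomial has gone through (1001 in the example) and
   `tail` holds the `tail_bits` data bits that follow the CRC slot.
   Returns a bit mask in which bit v is set if v reproduces itself. */
unsigned crc4_self_consistent(unsigned state, unsigned tail,
                              unsigned tail_bits)
{
    unsigned found = 0, v;

    for (v = 0; v < 16; v++) {
        unsigned reg = crc4_feed(state, v, 4);

        reg = crc4_feed(reg, tail, tail_bits);
        if (reg == v)
            found |= 1u << v;
    }
    return found;
}
```

For the example, crc4_self_consistent(0x9, 0xC, 4) sets exactly bits 12, 5, 2, and 11, i.e. the values 1100, 0101, 0010, and 1011.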
The fact that there is not a unique solution isn't really important;
only about 30% of the time will there be a unique solution. This does not
diminish the effectiveness of the CRC calculation because whichever of the
four values the CRC is set to, any virus installing itself in the program will
still change it. The fact that we did not get a unique solution does mean,
however, that it is possible to get the following situation:
          X2 + X1      = 1
     X3 + X2      + X0 = 1
          X2 + X1      = 1
     X3 + X2      + X0 = 0
Here equations 2 and 4 contradict each other. There are no values of X3 to X0
that will satisfy these equations. If the CRCSET program comes across this
situation, it will simply try again with another polynomial.
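The equations can be built and solved mechanically. The following sketch does this for the 4-bit example using Gauss-Jordan elimination in modulo-2 arithmetic; it illustrates the principle only and is not CRCSET's actual code. All names are invented, free variables are arbitrarily set to 0, and a return value of 0 marks the contradictory case in which another polynomial would have to be tried.

```c
#include <stdint.h>

#define SELF_WIDTH 4u
#define SELF_POLY  0xBu   /* 1011 */

/* Advance the register by the low `nbits` bits of `value`, MSB first. */
static unsigned self_feed(unsigned reg, unsigned value, unsigned nbits)
{
    while (nbits-- > 0) {
        unsigned in = (value >> nbits) & 1u;
        unsigned feedback = in ^ ((reg >> (SELF_WIDTH - 1)) & 1u);

        reg = (reg << 1) & ((1u << SELF_WIDTH) - 1u);
        if (feedback)
            reg ^= SELF_POLY;
    }
    return reg;
}

/* Final register when `x` is stored in the CRC slot. */
unsigned self_final(unsigned state, unsigned x, unsigned tail,
                    unsigned tail_bits)
{
    return self_feed(self_feed(state, x, SELF_WIDTH), tail, tail_bits);
}

/* Solve self_final(x) == x. Returns 1 and stores one solution in
   *out, or 0 if the equations contradict each other. */
int self_solve(unsigned state, unsigned tail, unsigned tail_bits,
               unsigned *out)
{
    unsigned rows[SELF_WIDTH], rhs[SELF_WIDTH], pivot_col[SELF_WIDTH];
    unsigned c = self_final(state, 0, tail, tail_bits); /* intermediate CRC */
    unsigned r, i, col, rank = 0, x = 0;

    /* Equation r: (effect of x on bit r) + x_r = c_r. */
    for (r = 0; r < SELF_WIDTH; r++) {
        rows[r] = 1u << r;
        rhs[r] = (c >> r) & 1u;
    }
    for (i = 0; i < SELF_WIDTH; i++) {
        /* Bits of the result that flip when x_i is set. */
        unsigned column = self_final(state, 1u << i, tail, tail_bits) ^ c;

        for (r = 0; r < SELF_WIDTH; r++)
            rows[r] ^= ((column >> r) & 1u) << i;
    }

    /* Gauss-Jordan elimination, one variable (column) at a time. */
    for (col = 0; col < SELF_WIDTH; col++) {
        unsigned t;

        for (r = rank; r < SELF_WIDTH && !((rows[r] >> col) & 1u); r++)
            ;
        if (r == SELF_WIDTH)
            continue;                       /* free variable */
        t = rows[r]; rows[r] = rows[rank]; rows[rank] = t;
        t = rhs[r];  rhs[r]  = rhs[rank];  rhs[rank]  = t;
        for (i = 0; i < SELF_WIDTH; i++)
            if (i != rank && ((rows[i] >> col) & 1u)) {
                rows[i] ^= rows[rank];
                rhs[i]  ^= rhs[rank];
            }
        pivot_col[rank++] = col;
    }

    for (r = rank; r < SELF_WIDTH; r++)
        if (rhs[r])
            return 0;                       /* 0 = 1: no solution */
    for (r = 0; r < rank; r++)
        x |= rhs[r] << pivot_col[r];
    *out = x;
    return 1;
}
```

For the worked example (register 1001 after the polynomial, trailing bits 1100) this finds 0010, one of the four solutions listed earlier. Starting instead from a register of 0000 with the same trailing bits produces a contradictory pair of equations, and the function returns 0.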
For illustration, I have used only a 4-bit CRC; the CRCSET algorithm
uses 32 bits. The principle is the same; it just takes more time (and ink,
paper, patience, caffeine, pizza, and chocolate chip cookies).
Since a software package often consists of more than just a single
executable, CRCSET has also been given the ability to calculate the CRC's for
other support files. It can store the CRC's in either an external data file
or in the main program file itself. If the CRC's are to be stored in the main
program file, CRCSET still searches for the _STEALTH string. The CRC's for
all the other files on the command line are calculated and written in the
order that the files were specified to the locations immediately after the
_STEALTH string. Once all the files have been exhausted, the CRC for the main
program file itself is calculated using the above technique.
To be sure that CRCSET doesn't overwrite anything critical in your
application, you have to define an array of polynomials and CRC's in your
program large enough to accommodate all the files in your package for which
you want to do a self-check. This is outlined in the next section.
- Page 20 -
How to Use Stealth Bomber - C
This code was written under Borland C++ 3.0 and Microsoft C 5.1.
Add the files SYSCHECK.C, DOSMCB.C, FILECHCK.C, CALCCRC.C, VIRUSDAT.C,
and BUFALLOC.C to the list of files required to build the program you are
working on (in Turbo C++, for example, add them to the project file).
To perform a system check, add a call to stealth_sys_check() somewhere
in your program, preferably before you install any interrupt handlers. The
function stealth_sys_check() returns a bit pattern composed of the following
flags:
- STEALTH_INTR_ERR (0x0001) if interrupts have been set beyond the
program's code space,
- STEALTH_DOS_MEM_ERR (0x0002) if DOS memory is inconsistent with BIOS
memory, and
- STEALTH_DOS_HIJACKED (0x0004) if any interrupt has been hijacked by
a JMP FAR or CALL FAR instruction.
The function returns STEALTH_OK if all tests pass.
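Testing the returned bit pattern can be sketched as follows. The constants are copied from the list above only so that the fragment stands alone (a real program would include VIRCHECK.H instead), STEALTH_OK is assumed to be 0 because no value is given here, and describe_sys_check() is a hypothetical helper that is not part of Stealth Bomber.

```c
#include <stddef.h>
#include <string.h>

/* Flag values as listed above; VIRCHECK.H is the authoritative source. */
#define STEALTH_OK            0x0000  /* assumed value */
#define STEALTH_INTR_ERR      0x0001
#define STEALTH_DOS_MEM_ERR   0x0002
#define STEALTH_DOS_HIJACKED  0x0004

/*
 * Hypothetical helper: write one line of text into buf for every flag set
 * in result.  Returns the number of flags that were set.
 */
int describe_sys_check(unsigned result, char *buf, size_t size)
{
    static const struct { unsigned flag; const char *text; } table[] = {
        { STEALTH_INTR_ERR,     "interrupt set beyond program code space\n" },
        { STEALTH_DOS_MEM_ERR,  "DOS memory inconsistent with BIOS memory\n" },
        { STEALTH_DOS_HIJACKED, "interrupt hijacked by JMP FAR or CALL FAR\n" }
    };
    size_t i;
    int count = 0;

    if (size == 0)
        return 0;
    buf[0] = '\0';

    /* The flags are independent bits, so test each one separately. */
    for (i = 0; i < sizeof table / sizeof table[0]; i++)
        if (result & table[i].flag) {
            strncat(buf, table[i].text, size - strlen(buf) - 1);
            count++;
        }
    return count;
}
```

Because more than one flag may be set at once, the result should be tested bit by bit rather than compared against a single constant; only the comparison with STEALTH_OK is an exact match.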
To check files, add a call to stealth_file_check(filename, filecrc)
for every file you want to verify. If the file name contains a drive and/or
directory, the name is taken as is. If the file name does not contain a drive
or directory specifier, the file is searched for first in the program's home
directory and then, if not found, in the DOS PATH. The function
stealth_file_check() returns a bit pattern composed of the following flags:
- STEALTH_FILE_ERR (0x0001) if the file was not found or couldn't be
opened,
- STEALTH_FILE_DATE_ERR (0x0002) if the file's date/time stamp was
invalid,
- STEALTH_FILE_SIZE_ERR (0x0004) if the file size was inconsistent
between directory and file open checks,
- STEALTH_CRC_BAD_POLY (0x0008) if the CRC polynomial is invalid,
- STEALTH_NO_MEM (0x0010) if there was no memory to perform the CRC
check, and
- STEALTH_CRC_INVALID (0x0020) if the CRC is invalid.
The function returns STEALTH_OK if all tests pass.
Return values and function prototypes are defined in the header file
VIRCHECK.H.
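One possible way to act on the file check result is sketched below. The grouping of the flags into "could not be checked" and "looks tampered with" is my own assumption for the sake of the example and is not something the library prescribes; classify_file_check() and the file_verdict names are hypothetical, STEALTH_OK is assumed to be 0, and the constants are repeated from the list above only so that the fragment compiles without VIRCHECK.H.

```c
/* Flag values as listed above; VIRCHECK.H is the authoritative source. */
#define STEALTH_FILE_ERR       0x0001
#define STEALTH_FILE_DATE_ERR  0x0002
#define STEALTH_FILE_SIZE_ERR  0x0004
#define STEALTH_CRC_BAD_POLY   0x0008
#define STEALTH_NO_MEM         0x0010
#define STEALTH_CRC_INVALID    0x0020

/* Hypothetical verdicts for the calling program to act on. */
enum file_verdict { FILE_CLEAN, FILE_UNCHECKED, FILE_SUSPECT };

enum file_verdict classify_file_check(unsigned result)
{
    /* Assumed grouping: the CRC could not be calculated at all. */
    const unsigned unchecked =
        STEALTH_FILE_ERR | STEALTH_CRC_BAD_POLY | STEALTH_NO_MEM;
    /* Assumed grouping: the file was read but does not look right. */
    const unsigned suspect =
        STEALTH_FILE_DATE_ERR | STEALTH_FILE_SIZE_ERR | STEALTH_CRC_INVALID;

    if (result & suspect)
        return FILE_SUSPECT;        /* the worse verdict wins */
    if (result & unchecked)
        return FILE_UNCHECKED;
    return FILE_CLEAN;
}
```

A program might abort with a warning on FILE_SUSPECT and ask the user to free some memory or reinstall on FILE_UNCHECKED, but that policy is up to you.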
This version of CRCSET provides two ways to store the CRC. The first
is to store the CRC of each file that you want to protect in one of the
executables, which can then take care of validating the entire package before
loading any of the other files. This executable would be the "primary file"
discussed below.
To do this, you would have to declare an array of type filecrc large
enough to hold all of the CRC's; no checking can be done by the CRCSET program
to prevent it from overwriting critical data if the array isn't large enough.
The simplest way to declare this array is to change the STEALTH_NFILES
constant in VIRUSDAT.C to match the number of files in your project. Leave
the rest of the file as it is; the C compiler will automatically fill in the
rest of the array with zeros. When CRCSET is run, it will calculate the CRC
for each program in the list and store it in the primary file. It will then
calculate the CRC for the primary file using the matrix algebra above and
write it to the first element of the array (at index 0).
The second way to store the CRC data is in an external data file. The
data file is specified to CRCSET on the command line and each CRC is written
to that file in sequence. The data file overwrites any file of the same name
and the CRC for the data file is not calculated. To perform a file check, you
would first have to read the CRC data from the data file and then call
stealth_file_check() with the appropriate file name and CRC from the data
file.
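Reading one entry back from the data file might look like the sketch below. The record layout is an assumption on my part: each entry is taken to be a 32-bit polynomial followed by a 32-bit CRC, both stored low byte first, in the order the files were given to CRCSET. Check the layout against the filecrc type in VIRCHECK.H before relying on it. The names crc_record and read_crc_record() are hypothetical.

```c
#include <stdio.h>

/* Assumed layout of one entry: 32-bit polynomial, then 32-bit CRC. */
struct crc_record {
    unsigned long polynomial;
    unsigned long crc;
};

/* Assemble a 32-bit value from four bytes stored low byte first. */
static unsigned long le32(const unsigned char *p)
{
    return (unsigned long)p[0]
         | ((unsigned long)p[1] << 8)
         | ((unsigned long)p[2] << 16)
         | ((unsigned long)p[3] << 24);
}

/*
 * Read entry number 'index' (counting from 0) from a stream opened in
 * binary mode.  Returns 1 on success, 0 if the entry could not be read.
 */
int read_crc_record(FILE *fp, long index, struct crc_record *out)
{
    unsigned char raw[8];

    if (fseek(fp, index * 8L, SEEK_SET) != 0)
        return 0;
    if (fread(raw, 1, sizeof raw, fp) != sizeof raw)
        return 0;

    out->polynomial = le32(raw);
    out->crc        = le32(raw + 4);
    return 1;
}
```

The entry read this way would then be passed, together with the matching file name, to stealth_file_check(). Keep in mind that the data file itself is not protected by a CRC.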
To validate only the running program, the following call in main()
should suffice:
     stealth_file_check(_osmajor >= 3 ? argv[0] : "progname.exe", _fcrc[0])
where _fcrc is declared in VIRUSDAT.C.
Under DOS 3.0 and above, the program name is stored in argv[0]. If
the program is running under DOS 2.x, you have to explicitly pass the program
name to the function and hope that it is in the path and hasn't been renamed
by the user.
A sample program TESTVIR.C has been provided. The syntax required to
set the CRC's for TESTVIR.EXE and its supporting files is in the header of
TESTVIR.C. It demonstrates all the functions and uses of Stealth Bomber and
can be used as a framework for your own programs.
If you run TESTVIR before running CRCSET on it, TESTVIR will abort
with a warning that it and its supporting files may have been infected. After
you set the CRC, run TESTVIR to assure yourself that the CRC's are valid.
One final note: in order for these routines to work properly, they
must be compiled with word alignment off. If you compile with word alignment
on, modules that use DOSMCB will have the wrong layout for the MCB (i.e. the
MCB will be off by one byte).
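The effect of alignment on the MCB can be seen in the following sketch. The structure is my own rendering of the 16-byte DOS memory control block and not a copy of the declaration in DOSMCB; the field names and mcb_layout_ok() are hypothetical. The pack pragma is the spelling understood by most current compilers; with the compilers named above you would use the compiler's own alignment switch instead.

```c
#include <stddef.h>

#pragma pack(push, 1)               /* byte alignment: word alignment off */
struct mcb {
    unsigned char  type;            /* 'M' = more blocks follow, 'Z' = last */
    unsigned short owner;           /* segment of owning program, 0 = free */
    unsigned short size;            /* size of the block in paragraphs */
    unsigned char  reserved[3];
    char           name[8];         /* owner name (DOS 4.0 and later) */
};
#pragma pack(pop)

/*
 * Returns 1 if the structure matches the layout DOS uses in memory.
 * With word alignment on, the compiler would insert a padding byte after
 * 'type', moving 'owner' from offset 1 to offset 2 and shifting every
 * later field by one byte.
 */
int mcb_layout_ok(void)
{
    return sizeof(struct mcb) == 16
        && offsetof(struct mcb, owner) == 1
        && offsetof(struct mcb, size) == 3
        && offsetof(struct mcb, name) == 8;
}
```

A check like this at program startup (or at compile time, where the compiler allows it) catches a wrong alignment setting before it produces false alarms from the memory check.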
- Page 22 -
How to Use Stealth Bomber - Turbo Pascal
This code was written under Turbo Pascal 5.5.
Add the VIRCHECK unit to the "Uses" clause somewhere in your program.
To perform a system check, add a call to StealthSysCheck somewhere in
your program, preferably before you install any interrupt handlers. The
function StealthSysCheck returns a bit pattern composed of the following
flags:
- StealthIntrErr ($0001) if interrupts have been set beyond the
program's code space,
- StealthDOSMemErr ($0002) if DOS memory is inconsistent with BIOS
memory, and
- StealthDOSHijacked ($0004) if any interrupt has been hijacked by a
JMP FAR or CALL FAR instruction.
The function returns StealthOK if all tests pass.
To check files, add a call to StealthFileCheck(FileName, FileCRC) for
every file you want to verify. If the file name contains a drive and/or
directory, the name is taken as is. If the file name does not contain a drive
or directory specifier, the file is searched for first in the program's home
directory and then, if not found, in the DOS PATH. The function
StealthFileCheck returns a bit pattern composed of the following flags:
- StealthFileErr ($0001) if the file was not found or couldn't be
opened,
- StealthFileDateErr ($0002) if the file's date/time stamp was
invalid,
- StealthFileSizeErr ($0004) if the file size was inconsistent between
directory and file open checks,
- StealthCRCBadPoly ($0008) if the CRC polynomial is invalid,
- StealthNoMem ($0010) if there was no memory to perform the CRC check, and
- StealthCRCInvalid ($0020) if the CRC is invalid.
The function returns StealthOK if all tests pass.
This version of CRCSET provides two ways to store the CRC. The first
is to store the CRC of each file that you want to protect in one of the
executables, which can then take care of validating the entire package before
loading any of the other files. This executable would be the "primary file"
discussed below.
To do this, you would have to declare an array of type FileCRC large
enough to hold all of the CRC's; no checking can be done by the CRCSET program
to prevent it from overwriting critical data if the array isn't large enough.
The simplest way to declare this array is to change the StealthNFiles constant
in VIRUSDAT.PAS to match the number of files in your project and to fill in
the rest of the array with the lines shown in VIRUSDAT.PAS to match the size
of the array. When CRCSET is run, it will calculate the CRC for each program
in the list and store it in the primary file. It will then calculate the CRC
for the primary file using the matrix algebra above and write it to the first
element of the array (at index 1).
The second way to store the CRC data is in an external data file. The
data file is specified to CRCSET on the command line and each CRC is written
to that file in sequence. The data file overwrites any file of the same name
and the CRC for the data file is not calculated. To perform a file check, you
would first have to read the CRC data from the data file and then call
StealthFileCheck with the appropriate file name and CRC from the data file.
To validate only the running program, the following call in your main
module should suffice:
     if Lo(DosVersion) >= 3 then
       Result := StealthFileCheck(ParamStr(0), _FCRC[1])
     else
       Result := StealthFileCheck('progname.exe', _FCRC[1]);
where _FCRC is declared in VIRUSDAT.PAS.
Under DOS 3.0 and above, the program name is stored in ParamStr(0).
If the program is running under DOS 2.x, you have to explicitly pass the
program name to the function and hope that it is in the path and hasn't been
renamed by the user.
A sample program TESTVIR.PAS has been provided. The syntax required
to set the CRC's for TESTVIR.EXE and its supporting files is in the header of
TESTVIR.PAS. It demonstrates all the functions and uses of Stealth Bomber and
can be used as a framework for your own programs.
If you run TESTVIR before running CRCSET on it, TESTVIR will abort
with a warning that it and its supporting files may have been infected. After
you set the CRC, run TESTVIR to assure yourself that the CRC's are valid.
One final note: the CRC routines try to allocate a buffer for reading
the file and adjust the size of the buffer to match the amount of free memory.
For this to work properly (i.e. without aborting with a run-time error of not
enough memory), you will have to install a heap error handler (look up
HeapError in the index of the "Turbo Pascal Reference Guide"). The example
file TESTVIR.PAS has a simple handler installed.
- Page 24 -
CRCSET.EXE Syntax and Messages
Once you have compiled your program, you have to calculate its CRC.
The program CRCSET.EXE has been provided for this purpose. The syntax is:
     crcset [-q] [-s search string] [-p primary file | -d data file] [file]
            [file] ... [-s search string] ...

     -q            = run quiet and redirect messages and errors to files.
     search string = string to search for when writing CRC to a primary file.
     primary file  = primary file, usually an executable, to which to write
                     its own CRC and the CRC of all "file" parameters that
                     follow.
     data file     = data file to which to write the CRC of all "file"
                     parameters that follow; overwrites any file of the same
                     name.
     file          = file for which to calculate the CRC; if no data or
                     primary file has been specified, the CRC is written to
                     this file.
Files are processed in groups. Specifying a new search string, primary file,
or data file closes the current group and starts a new one.
The string for which CRCSET searches is stored in _fcrc[0] in C and
_FCRC[1] in Turbo Pascal. The default is _STEALTH but you may change it if
there is a conflict (i.e. if there is more than one instance of _STEALTH in
the program, CRCSET will not know which one holds the CRC and so will not set
it). CRCSET replaces the string with a randomly-generated polynomial and the
CRC itself and adds the polynomials and CRC's of any other files in the group
after the location of the search string.
For example, to set the CRC for the C sample program, the command is:
     crcset -p testvir.exe testvir.obj -d crc.dat testvir.c
This will write the CRC of testvir.obj to _fcrc[1], testvir.exe to _fcrc[0],
and testvir.c to the file CRC.DAT. The object and source files are shown here
as examples of supporting files for which you may want to run a CRC check.
Supporting files could be files like overlays, dynamic link libraries, or
configuration files.
If you want to test the reliability of the CRC check, change a few
bytes in TESTVIR.EXE, TESTVIR.OBJ, or TESTVIR.C (TESTVIR.C is the safest).
Run TESTVIR again, and it should warn you that one of the files may have been
infected.
If you changed the default search string to something like MyName, you
would set the CRC's as follows:
     crcset -s MyName -p testvir.exe testvir.obj -d crc.dat testvir.c
The case of the string on the command line must match exactly the case of the
string in the program. Also, any strings shorter than 8 characters must be
padded with 0's (ASCII 0, not the character '0') in the program.
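The padding rule can be illustrated in C as follows. This is only a sketch: marker stands in for the place in VIRUSDAT.C where the search string is stored, and MARKER_LEN and marker_matches() are hypothetical names. In C, an 8-byte character array initialized with a shorter string literal has its remaining bytes set to ASCII 0 by the compiler, which gives exactly the padding described above.

```c
#include <string.h>

#define MARKER_LEN 8

/* Stand-in for the search string slot; bytes 6 and 7 are zero-filled. */
const char marker[MARKER_LEN] = "MyName";

/*
 * Returns 1 if the 8-byte slot holds 'text' padded with ASCII 0 bytes,
 * 0 otherwise.  The comparison is case-sensitive, as the match between
 * the command line and the program must be.
 */
int marker_matches(const char *slot, const char *text)
{
    char expect[MARKER_LEN];
    size_t len = strlen(text);

    if (len > MARKER_LEN)
        return 0;
    memset(expect, 0, sizeof expect);   /* pad with ASCII 0, not '0' */
    memcpy(expect, text, len);
    return memcmp(slot, expect, MARKER_LEN) == 0;
}
```

Here marker_matches(marker, "MyName") succeeds while marker_matches(marker, "myname") fails, since the case differs.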
The quiet option ('-q') is present principally for developers. Some
packages store the user's name and registration number in the executable
itself. If the executable has previously had CRCSET run on it, writing this
information will change the CRC. The quiet option allows CRCSET to run
without displaying any messages on the screen; normal output goes to
CRCSET.OUT and error messages go to CRCSET.ERR. CRCSET will return an error
code to DOS if anything goes wrong so your installation program can verify
that everything worked correctly.
To use the quiet option, your package should be distributed _without_
having had CRCSET run on it. I recommend that you change the default search
string to a jumble of numbers, letters, and punctuation characters to reduce
the chance of anyone using the search string as part of the registration
information. In the installation procedure, you would first write the user
information to the executable and then run CRCSET with the quiet option to set
the CRC for your program and its supporting files. See the license on page 2
for restrictions on using Stealth Bomber in this way.
Despite its complexity, CRCSET.EXE takes only a few seconds to
calculate the CRC of the target file. I have made some optimizations to the
algorithm that make the calculation time almost constant regardless of the
size of the file. Once a CRC has been determined for your program, it takes
little time for the validation function to verify it every time the program is
run.
CRCSET will display any of the following messages. These messages,
when run with the quiet option, will appear either in CRCSET.OUT or
CRCSET.ERR.
File=[file], polynomial=1234abcd, CRC=5678ef90.
The CRC for [file] under the polynomial 1234abcd is 5678ef90.
(CRCSET.OUT)
File=[file], polynomial=1234abcd, CRC=5678ef90 (unique).
The CRC for [file] under the polynomial 1234abcd is 5678ef90.
This message is for a primary file specified with the '-p'
option. The CRC shown is a unique solution for the matrix.
This will occur only about 30% of the time. (CRCSET.OUT)
File=[file], polynomial=1234abcd, CRC=5678ef90 (2^N solutions).
The CRC for [file] under the polynomial 1234abcd is 5678ef90.
This message is for a primary file specified with the '-p'
option. The CRC shown is not a unique solution; there are 2^N
possible solutions to the matrix. This does not diminish the
effectiveness of the CRC. (CRCSET.OUT)
Quiet option must be the first option specified.
The option '-q', if specified at all, should be specified
first. (CRCSET.ERR)
No files specified for processing.
No files were passed on the command line for CRC calculation.
(CRCSET.ERR)
No search string specified for -s option.
The '-s' option was specified without a parameter.
(CRCSET.ERR)
No data file name specified for '-d' option.
The '-d' option was specified without a parameter.
(CRCSET.ERR)
Invalid option -X.
This is a catch-all for any option that CRCSET doesn't
recognize. (CRCSET.ERR)
File [file] not found.
The file specified for processing doesn't exist or couldn't be
opened. (CRCSET.ERR)
Primary file [file] not found.
The primary file specified for processing doesn't exist or
couldn't be opened. (CRCSET.ERR)
Unable to create data file [file].
The data file couldn't be created. Either a file of the same
name exists and has the DOS read-only bit set or the file name
has invalid characters in it. (CRCSET.ERR)
Unable to allocate buffer for file [file].
There was not enough memory to allocate a read buffer for the
file. (CRCSET.ERR)
Unable to allocate buffer for primary file [file].
There was not enough memory to allocate a read buffer for the
primary file. (CRCSET.ERR)
Search string [search string] not found in file [file].
The search string (usually "_STEALTH") was not found in the
primary file. To fix this, either add VIRUSDAT.C to the
project or add VIRUSDAT.PAS to the "uses" clause. Also make
sure that the search string passed with the '-s' option, if
any, is correct. (CRCSET.ERR)
Search string [search string] found more than once in file [file].
The search string (usually "_STEALTH") was found more than
once in the primary file. To fix this, change the default
search string in VIRUSDAT.C or VIRUSDAT.PAS and pass the new
search string to CRCSET with the '-s' option. (CRCSET.ERR)
- Page 27 -
Vulnerability
The Stealth Bomber algorithm, like every other anti-virus algorithm,
is vulnerable to attack. Hand-tweaking the code to bypass the virus
protection is always possible. A direct attack that determines the storage
location of the polynomial and the CRC and changes them is also possible, but,
on a program of any reasonable size (greater than 20k), this can take upwards
of half an hour on a 386. Any virus that ties up the computer for that long
wins no points for discretion. Any user who doesn't do anything about a
system lockup lasting over 30 seconds probably has many other doors open for
viruses anyway. :-)
Viruses have two advantages: they are loaded first before the
program can perform a self-check and the virus writers have access to my code
whereas I don't have access to theirs.
The first advantage can be pretty well overcome by using an anti-virus
sentinel program that constantly monitors a system for suspicious activity.
Most people still don't use them, hence the need for this code. The second
advantage I can do nothing about; by explaining my anti-virus methods and
distributing the source, I leave things wide open for attack. Build a better
mousetrap and someone is bound to build a better mouse. For as long as the
vandal mentality exists, we're stuck with viruses.
Stealth Bomber performs only the most basic memory check and only
checks the files you specify to it. As a result, a virus already in memory
that has come in from another program or possibly from the boot sector of the
disk will probably not be detected.
There is no substitute for proper precautions: downloading from a
reputable BBS, avoiding pirated software, scanning programs for viruses before
using them, and so on. This program was developed with the knowledge that
most people don't take these precautions (based on a sample size of at least 1
- me); rather than leave it up to the end user to protect against viruses,
with this we programmers can take on some of the burden by protecting the
programs we write against them.
- Page 28 -